In contrast to control-theoretic methods, model-free reinforcement learning (RL) methods still lack stability guarantees, which remains a significant problem. Jointly learning a policy and a Lyapunov function has recently become a promising approach to providing the whole system with a stability guarantee. However, the classical Lyapunov constraints introduced in prior work cannot stabilize the system during sampling-based optimization. Therefore, we propose the Adaptive Stability Certification (ASC), which lets the system reach sampling-based stability. Because the ASC condition can search for the optimal policy heuristically, we design the Adaptive Lyapunov-based Actor-Critic (ALAC) algorithm based on it. Meanwhile, our algorithm avoids the optimization problem in current approaches, where a variety of constraints are coupled into the objective. When evaluated on ten robotic tasks, our method achieves lower accumulated cost and fewer stability constraint violations than previous studies.
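Reinforcement learning (RL) methods, as a promising technique, have achieved superior results in the motion planning of free-floating space robots. However, because of the increased planning dimension and the stronger dynamic coupling of the system, motion planning for dual-arm free-floating space robots remains an open challenge. In particular, current studies cannot handle the task of capturing a non-cooperative object because they lack pose constraints on the end-effectors. To address this problem, we propose a novel algorithm that helps RL-based methods improve planning accuracy efficiently. Our core contributions are constructing a mixed policy guided by prior knowledge and introducing the infinity norm to build a more reasonable reward function. Furthermore, our method successfully captures rotating objects with different spinning speeds.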
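Decoding images from brain activity has long been a challenge. Thanks to the development of deep learning, tools are now available to address this problem. Image decoding aims to map neural spike trains to the spaces of low-level visual features and high-level semantic information. Recently, there have been a few studies on decoding from spike trains; however, these studies pay less attention to the foundations of neuroscience, and few of them incorporate receptive fields into visual image reconstruction. In this paper, we propose a deep learning neural network architecture with biological properties to reconstruct visual images from spike trains. To our knowledge, we implement a method that integrates a receptive field property matrix into the loss function. Our model is an end-to-end decoder from neural spike trains to images. We not only incorporate Gabor filters into the autoencoder used to generate images, but also propose a loss function with receptive field properties. We evaluate our decoder on two datasets, which contain neural spikes from the macaque primary visual cortex and spikes from salamander retinal ganglion cells (RGCs). Our results show that our method can effectively combine receptive field features to reconstruct images, providing a new approach to visual reconstruction based on neural information.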
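In recent years, non-cooperative objects such as failed satellites and space debris have appeared in space. These objects are usually operated on or collected by free-floating dual-arm space manipulators. Because they remove the difficulties of modeling and manual parameter tuning, reinforcement learning (RL) methods have shown promise in the trajectory planning of space manipulators. Although previous studies demonstrate their effectiveness, they cannot be applied to tracking dynamic targets with unknown rotation (non-cooperative objects). In this paper, we propose a learning system for the motion planning of a free-floating dual-arm space manipulator (FFDASM) toward non-cooperative objects. Specifically, our method consists of two modules. Module I performs multi-target trajectory planning for the two end-effectors within a large target space. Module II then takes the point cloud of the non-cooperative object as input to estimate its motion properties, from which the positions of the target points on the non-cooperative object can be predicted. We combine Module I and Module II to successfully track target points on a rotating object with unknown regularity. Furthermore, experiments also demonstrate the scalability and generalization of our learning system.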
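Because of its quadratic complexity, the attention module, an important component of the Transformer, cannot scale efficiently to long sequences. Many works focus on approximating the dot-then-exponentiate softmax function, leading to sub-quadratic or even linear-complexity Transformer architectures. However, we show that these methods cannot be applied to more powerful attention modules that go beyond the dot-then-exponentiate style, e.g., Transformers with relative positional encoding (RPE). Since relative positional encoding is used by default in many state-of-the-art models, designing efficient Transformers that can incorporate RPE is appealing. In this paper, we propose a novel way to accelerate attention computation for Transformers with RPE on top of kernelized attention. Based on the observation that relative positional encoding forms a Toeplitz matrix, we mathematically show that kernelized attention with RPE can be computed efficiently using the Fast Fourier Transform (FFT). With FFT, our method achieves $\mathcal{O}(n \log n)$ time complexity. Interestingly, we further demonstrate that using relative positional encoding properly can mitigate the training instability problem of vanilla kernelized attention. On a wide range of tasks, we empirically show that our models can be trained from scratch without any optimization issues. The learned model performs better than many efficient Transformer variants and is faster than the standard Transformer in the long-sequence regime.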
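The Toeplitz observation in the abstract above rests on a standard trick: an n x n Toeplitz matrix can be embedded in a 2n x 2n circulant matrix, and a circulant matrix-vector product is a circular convolution that the FFT evaluates in O(n log n). The sketch below illustrates only that generic building block, not the paper's kernelized attention; the function name `toeplitz_matvec_fft` and its argument names are assumptions made for illustration.

```python
import numpy as np

def toeplitz_matvec_fft(first_col, first_row, x):
    """Compute T @ x for an n x n Toeplitz matrix T in O(n log n).

    T[i, j] = first_col[i - j] when i >= j, else first_row[j - i].
    Assumes first_col[0] == first_row[0]. Names are illustrative only.
    """
    n = len(x)
    # First column of the 2n x 2n circulant embedding:
    # [c_0, ..., c_{n-1}, 0, r_{n-1}, ..., r_1]
    circ_col = np.concatenate([first_col, [0.0], first_row[:0:-1]])
    x_padded = np.concatenate([x, np.zeros(n)])
    # A circulant matvec is a circular convolution, computed via the FFT.
    y = np.fft.ifft(np.fft.fft(circ_col) * np.fft.fft(x_padded))
    # The first n entries equal T @ x (inputs are assumed real).
    return y[:n].real
```

In an RPE setting, the entries of `first_col` and `first_row` would play the role of the position-dependent terms indexed by the offset i - j; how those terms enter the attention computation is specific to the paper and is not reproduced here.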
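The Transformer architecture has become a dominant choice in many domains, such as natural language processing and computer vision. Yet it has not achieved competitive performance on popular leaderboards of graph-level prediction compared with mainstream GNN variants. It therefore remains a mystery how Transformers could perform well for graph representation learning. In this paper, we address this mystery by presenting Graphormer, which is built upon the standard Transformer architecture and attains excellent results on a broad range of graph representation learning tasks, especially on the recent OGB Large-Scale Challenge. Our key insight for using Transformers on graphs is the necessity of effectively encoding the structural information of a graph into the model. To this end, we propose several simple yet effective structural encoding methods to help Graphormer better model graph-structured data. Furthermore, we mathematically characterize the expressive power of Graphormer and show that, with our ways of encoding the structural information of graphs, many popular GNN variants can be covered as special cases of Graphormer.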
Masked image modeling (MIM) performs strongly in pre-training large vision Transformers (ViTs). However, small models that are critical for real-world applications benefit only marginally, or not at all, from this pre-training approach. In this paper, we explore distillation techniques to transfer the success of large MIM-based pre-trained models to smaller ones. We systematically study different options in the distillation framework, including distillation targets, losses, input, network regularization, and sequential distillation, revealing that: 1) distilling token relations is more effective than CLS-token-based and feature-based distillation; 2) using an intermediate layer of the teacher network as the target performs better than using the last layer when the depth of the student does not match that of the teacher; 3) weak regularization is preferred. With these findings, we achieve significant fine-tuning accuracy improvements over MIM pre-training from scratch on ImageNet-1K classification, with gains of +4.2%/+2.4%/+1.4% for the ViT-Tiny, ViT-Small, and ViT-Base models, respectively. Our TinyMIM model of base size achieves 52.2 mIoU on ADE20K semantic segmentation, which is +4.1 higher than the MAE baseline. Our TinyMIM model of tiny size achieves 79.6% top-1 accuracy on ImageNet-1K image classification, which sets a new record for small vision models of the same size and computation budget. This strong performance suggests an alternative way of developing small vision Transformer models: exploring better training methods rather than introducing inductive biases into architectures, as in most previous works. Code is available at https://github.com/OliverRensu/TinyMIM.
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
Benefiting from its ability to exploit intrinsic supervision information, contrastive learning has recently achieved promising performance in deep graph clustering. However, we observe that two drawbacks of the positive and negative sample construction mechanisms prevent existing algorithms from improving further. 1) The quality of positive samples depends heavily on carefully designed data augmentations, and inappropriate augmentations easily lead to semantic drift and indiscriminative positive samples. 2) The constructed negative samples are unreliable because they ignore important clustering information. To solve these problems, we propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC) that mines the intrinsic supervision information in high-confidence clustering results. Specifically, instead of conducting complex node or edge perturbation, we construct two views of the graph by designing special Siamese encoders whose weights are not shared between the sibling sub-networks. Then, guided by the high-confidence clustering information, we carefully select and construct the positive samples from the same high-confidence cluster in the two views. Moreover, to construct semantically meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples, thus improving the discriminative capability and reliability of the constructed sample pairs. Lastly, we design an objective function that pulls samples from the same cluster together and pushes those from other clusters away by maximizing the cross-view cosine similarity of positive samples and minimizing that of negative samples. Extensive experimental results on six datasets demonstrate the effectiveness of CCGC compared with existing state-of-the-art algorithms.
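The sample construction described in the abstract above can be illustrated with a small NumPy sketch. It is a simplified stand-in under stated assumptions, not the CCGC objective itself: the squared-error form of the loss, the function names `cosine_sim` and `cluster_guided_loss`, and the boolean `confident` mask are all hypothetical choices made for illustration. Positives are cross-view pairs from the same high-confidence cluster, and negatives are the centers of different high-confidence clusters.

```python
import numpy as np

def cosine_sim(a, b):
    """Pairwise cosine similarity between rows of a and rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def cluster_guided_loss(z1, z2, labels, confident):
    """Toy cross-view loss (assumed form, for illustration only).

    z1, z2    : (N, d) embeddings of the same nodes from two views
    labels    : (N,) cluster assignments
    confident : (N,) boolean mask of high-confidence samples
    Requires at least two high-confidence clusters.
    """
    sim = cosine_sim(z1, z2)
    # Positive pairs: both samples confident and in the same cluster.
    same = (labels[:, None] == labels[None, :])
    same &= confident[:, None] & confident[None, :]
    pos_loss = np.mean((1.0 - sim[same]) ** 2)   # pull similarity toward 1

    # Negative pairs: centers of different high-confidence clusters.
    ids = np.unique(labels[confident])
    c1 = np.stack([z1[confident & (labels == k)].mean(axis=0) for k in ids])
    c2 = np.stack([z2[confident & (labels == k)].mean(axis=0) for k in ids])
    off_diag = ~np.eye(len(ids), dtype=bool)
    neg_loss = np.mean(cosine_sim(c1, c2)[off_diag] ** 2)  # push toward 0
    return pos_loss + neg_loss
```

With well-separated clusters the loss is near zero, and it grows as cluster centers from the two views align with each other.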
As one of the prevalent methods for building automated systems, Imitation Learning (IL) shows promising performance in a wide range of domains. However, despite the considerable improvement in policy performance, research on the explainability of IL models is still limited. Inspired by recent approaches in explainable artificial intelligence, we propose a model-agnostic explanation framework for IL models called R2RISE. R2RISE aims to explain the overall policy performance with respect to the frames in the demonstrations. It iteratively retrains the black-box IL model on randomly masked demonstrations and uses the conventional evaluation outcome, the environment return, as the coefficient to build an importance map. We also conduct experiments to investigate three major questions concerning the equality of frames' importance, the effectiveness of the importance map, and the connections between importance maps from different IL models. The results show that R2RISE successfully distinguishes important frames in the demonstrations.
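The masking-and-weighting idea in the abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the R2RISE implementation: `train_and_evaluate` is a hypothetical callback standing in for "retrain the IL model on the masked demonstrations and return the environment return", and the per-frame normalization by how often each frame was kept is an assumed choice.

```python
import numpy as np

def frame_importance(num_frames, num_masks, keep_prob, train_and_evaluate, seed=0):
    """Return-weighted average of random frame masks (illustrative sketch).

    train_and_evaluate(mask) -> float is a hypothetical callback: it should
    retrain the model on the frames where mask is True and report the return.
    """
    rng = np.random.default_rng(seed)
    # Each row is one random binary mask over the demonstration frames.
    masks = rng.random((num_masks, num_frames)) < keep_prob
    returns = np.array([train_and_evaluate(m) for m in masks], dtype=float)
    # Weight each mask by its return, then average per frame over the
    # masks in which that frame was kept.
    weighted = returns @ masks.astype(float)
    counts = np.maximum(masks.sum(axis=0), 1)
    return weighted / counts
```

Frames whose presence coincides with high returns receive high scores, so a toy evaluator that rewards only one frame should rank that frame first.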